Crystallization and preliminary X-ray diffraction 
analysis of the N-terminal domain of human 
coronavirus OC43 nucleocapsid protein 


The N-terminal domain of nucleocapsid protein from human coronavirus OC43 
(HCoV-OC43 N-NTD) mostly contains positively charged residues and has 
been identified as being responsible for RNA binding during ribonucleocapsid 
formation in the coronavirus. In this study, the crystallization and preliminary 
crystallographic analysis of HCoV-OC43 N-NTD (amino acids 58-195) with a 
molecular weight of 20 kDa are reported. HCoV-OC43 N-NTD was crystallized 
at 293 K using PEG 1500 as a precipitant and a 99.9% complete native data set 
was collected to 1.7 Å resolution at 100 K with an overall Rmerge of 5.0%. The 
crystals belonged to the hexagonal space group P6₅, with unit-cell parameters 
a = 81.57, c = 42.87 Å. Solvent-content calculations suggest that there is likely to 
be one subunit of N-NTD in the asymmetric unit. 


1. Introduction 


Human coronavirus OC43 (HCoV-OC43) is responsible for approxi- 
mately 20% of all colds (Kaye et al., 1971; Lai & Cavanagh, 1997). 
Although HCoV-OC43 infections are generally mild, more severe 
upper and lower respiratory-tract infections such as bronchiolitis and 
pneumonia have been characterized, especially in infants, elderly 
individuals and immunocompromised patients (El-Sahly et al., 2000; 
Gagneur et al., 2002; St-Jean et al., 2004). According to serological 
cross-reactivity, HCoV-OC43 is a representative of the class II 
coronaviruses. The RNA genomes of coronaviruses comprise 
several genes that encode structural and nonstructural 
proteins and are arranged in the order 5’-pol (polymerase)-S (spike)-
E (envelope)-M (matrix)-N (nucleocapsid)-3’ (Navas-Martin & 
Weiss, 2004). The virion envelope surrounding the nucleocapsid 
contains the structural proteins S, M, E and N. Some coronaviruses also contain 
a third glycoprotein, HE (haemagglutinin-esterase), which is present 
in most class II coronaviruses. The primary function of the HCoV N 
protein is to recognize a stretch of RNA that serves as a packaging 
signal and leads to the formation of the ribonucleoprotein (RNP) 
complex during assembly (Lai & Cavanagh, 1997). RNP may be 
important in keeping the RNA in an ordered conformation suitable 
for replication and transcription of the viral genome (Lai, 2003; 
Nelson et al., 2000; Huang et al., 2004; Navas-Martin & Weiss, 2004). 
In addition, the N protein is an important diagnostic marker and 
is the most immunodominant antigen in infected hosts (Chan et al., 
2005; Woo et al., 2004). 

The N protein of HCoV-OC43 has a molecular weight of 49.5 kDa 
and a pI of 10.0 and is a highly basic protein with a high hydrophilicity 
(Hogue & Brian, 1986; Pohl-Koppe et al., 1995). We have previously 
shown that the N-terminal domain of HCoV-OC43 (N-NTD) con- 
tains most of the positively charged residues that are responsible for 
RNA binding, while the C-terminal domain (N-CTD) mainly acts 
as an oligomerization module to form a capsid (Huang et al., 2009; 
Saikatendu et al., 2007; Jayaram et al., 2006; Fan et al., 2005). The 
central disordered region of the N protein has also been shown to 
contain an RNA-binding region (Chang et al., 2009). In order to 
clarify the mechanism by which the N protein of HCoV-OC43 binds 
to nucleic acids, we have undertaken the determination of the crystal 
structure of the N-terminal domain of HCoV-OC43 (residues 
58-195). 


Table 1 
Data-collection statistics for HCoV-OC43 N-NTD crystals. 

Values in parentheses are for the highest resolution shell. 

X-ray source                BL13B1, NSRRC 
Wavelength (Å)              1.0 
Space group                 P6₅ 
Unit-cell parameters (Å)    a = 81.57, c = 42.87 
Resolution limits (Å)       30-1.70 (1.76-1.70) 
Total reflections           149206 
Unique reflections          18040 
Completeness (%)            99.9 (99.9) 
Redundancy                  8.3 (7.7) 
Rmerge† (%)                 5.0 (76.1) 
⟨I/σ(I)⟩                    27.8 (2.46) 

† $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$. 
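
The Rmerge statistic reported in Table 1 is the usual merging residual: the summed absolute deviation of each observation from the mean of its symmetry-equivalent group, divided by the summed intensity of all observations. The following is a minimal Python sketch of that definition, not the HKL-2000 implementation. The function name `r_merge` and the toy observations are hypothetical, and the sketch assumes that symmetry-equivalent reflections have already been mapped to a common index. Real programs may treat singly measured reflections differently.

```python
from collections import defaultdict


def r_merge(observations):
    """Return R_merge for an iterable of ((h, k, l), intensity) pairs.

    Assumes symmetry-equivalent reflections have already been mapped to the
    same (h, k, l) index, so grouping by index groups the equivalents.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        # Sum of |I_i(hkl) - <I(hkl)>| over the observations of this reflection.
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        # Sum of I_i(hkl) over the same observations.
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical toy observations for illustration only (not data from this study).
toy_observations = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((1, 0, 0), 90.0),
    ((2, 1, 0), 50.0),
    ((2, 1, 0), 54.0),
]
toy_r_merge = r_merge(toy_observations)

# Redundancy is the ratio of total to unique reflections (values from Table 1).
redundancy = 149206 / 18040

print(f"toy R_merge = {toy_r_merge:.3f}")
print(f"redundancy  = {redundancy:.1f}")
```

The ratio of total to unique reflections rounds to 8.3, in agreement with the overall redundancy listed in Table 1.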


2. Experimental methods 
2.1. Expression and purification of HCoV-OC43 N-NTD 


The templates for the HCoV-OC43 N protein were kindly provided 
by the Institute of Biological Chemistry, Academia Sinica (Taipei, 
Taiwan). In order to generate a truncated form of the recombinant 
HCoV-OC43 N-NTD, the gene encoding the N protein was amplified 
by the polymerase chain reaction (PCR) using Pfu polymerase with 
the forward primer 5’-CGCTATGAATTCAATGTTGTACCCTACTATTCTTGGTTC-3’ 
and the reverse primer 5’-ACAACGCTCGAGAGCAGACCTTCCTGAGCCTTCAATAT-3’. 
The PCR products 
were digested with EcoRI and XhoI and the resulting DNA frag- 
ments were cloned into pET28a (Novagen) using T4 ligase (NEB). 
The recombinant plasmid was transformed into Escherichia coli 
BL21-RIL using the heat-shock method. The cells were grown at 
310 K in Luria-Bertani medium containing 50 mg l⁻¹ kanamycin. 
Protein expression was induced at an OD600 of 0.6 by the addition of 
1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), followed by incubation 
at 283 K for 24 h. The expressed N-NTD contains N-terminal 
(MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSEF) and C-terminal 
(LEHHHHHH) tags. After harvesting the bacteria by centrifugation 
(6000 rev min⁻¹, 30 min, 277 K), the bacterial pellets were resus- 
pended in sonication buffer (50 mM Tris-buffered solution pH 7.5, 
150 mM NaCl and 15 mM imidazole) and lysed by sonication for 
30 min (3 s pulse on and 6 s pulse off). Soluble proteins were obtained 
from the supernatant by centrifugation (13 000 rev min⁻¹, 30 min, 
277 K). N-NTD carrying His₆ tags at the N-terminus and C-terminus 
was purified using an Ni-NTA column (Novagen) with an elution 
gradient from 15 to 250 mM imidazole in buffer solution (50 mM Tris- 
buffered solution pH 7.5 and 150 mM NaCl). Fractions containing 
N-NTD, which eluted at 250 mM imidazole, were pooled and dialyzed 
against 50 mM Tris-buffered solution pH 7.5 containing 150 mM 
NaCl (Fig. 1). The purified N-NTD was concentrated to 8 mg ml⁻¹ in 
50 mM Tris-HCl pH 7.5 containing 255 mM NaCl prior to crystal- 
lization. 


Figure 1 

SDS-PAGE analysis of HCoV-OC43 N-NTD stained with Coomassie Brilliant 
Blue. Lane M, protein markers (kDa); lane 1, purified HCoV-OC43 N-NTD; lane 2, 
concentrated HCoV-OC43 N-NTD after dialysis. 
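
The cloning step in this section relies on each primer carrying the recognition site of one of the two restriction enzymes. As a quick check, the Python sketch below searches both primers for the standard EcoRI (GAATTC) and XhoI (CTCGAG) recognition sequences. The primer sequences are taken from the text with the line-break hyphens removed; the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}

# Primer sequences as given in the text (5' to 3'), line-break hyphens removed.
FORWARD = "CGCTATGAATTCAATGTTGTACCCTACTATTCTTGGTTC"
REVERSE = "ACAACGCTCGAGAGCAGACCTTCCTGAGCCTTCAATAT"


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {
        name: primer.find(site)
        for name, site in SITES.items()
        if site in primer
    }


forward_sites = find_sites(FORWARD)
reverse_sites = find_sites(REVERSE)

print("forward primer:", forward_sites)
print("reverse primer:", reverse_sites)
```

Under these assumptions the forward primer carries only the EcoRI site and the reverse primer only the XhoI site, each preceded by six flanking nucleotides, which is consistent with directional cloning into EcoRI/XhoI-digested pET28a.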


2.2. Crystallization 


Initial crystallization experiments were set up using the Qiagen 
JCSG+ Suite and PACT Suite crystal screens (Newman, 2005) by 
the sitting-drop vapour-diffusion method in accordance with our 
previously described protocol (Chou & Hou, 2008). Each of the 
crystallization conditions (2 µl) from the screening kits was mixed 
with 1.5 µl purified protein solution (8 mg ml⁻¹) and 0.5 µl 40% 
hexanediol at room temperature (~298 K) and equilibrated against 
400 µl solution in the well of a Cryschem plate. The conditions were 
refined over seven cycles and crystals were grown in solution con- 
taining 0.25 M succinic acid-phosphate-glycine (SPG) buffer pH 6.0 and 
25% PEG 1500, equilibrated at 293 K against 400 µl precipitation 
solution. The SPG buffer was prepared by mixing succinic acid 
(Sigma), sodium dihydrogen phosphate (Merck) and glycine (Merck) 
in a 2:7:7 molar ratio and was adjusted to pH 6.0 with sodium 
hydroxide (Newman, 2004). The crystals appeared within two weeks 
and the largest crystal grew to dimensions of approximately 200 × 
100 × 100 µm (Fig. 2). 


2.3. X-ray data collection and processing 


Crystals were soaked in reservoir solution containing 30%(v/v) 
glycerol as a cryoprotectant prior to flash-cooling in a nitrogen-gas 
stream at 100 K. High-resolution X-ray data were collected using a 
synchrotron-radiation source. A preliminary diffraction image was 
obtained on the NSRRC (National Synchrotron Radiation Research 
Center, Hsinchu, Taiwan) contract beamline BL12B2 at SPring-8 
(Hyogo, Japan) using an ADSC Q210r detector. The complete data 
set was collected on beamline BL13B1 at NSRRC using an ADSC 
Q315r detector. The crystal-to-detector distance was 150 mm. The 
oscillation width and exposure time for each frame were 1° and 10 s, 
respectively. Crystallographic data integration and reduction were 
performed using the HKL-2000 program package (Otwinowski & 
Minor, 1997). The crystallographic statistics of data collection for 
N-NTD are listed in Table 1. 


Figure 2 

Crystals of HCoV-OC43 N-NTD obtained with 25% PEG 1500 as a precipitant at 
pH 6.0 by the sitting-drop vapour-diffusion method. The approximate dimensions 
of the crystal are 200 × 100 × 100 µm. 
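
The data-collection geometry can be related to the resolution limit through Bragg's law, λ = 2d sin θ. The Python sketch below estimates where the 1.7 Å ring falls on the detector for the stated wavelength (1.0 Å, Table 1) and crystal-to-detector distance (150 mm). It assumes a flat detector perpendicular to the beam with the direct beam at its centre, and it assumes a square active area of 315 mm on a side for the Q315r; neither the detector offset nor the active-area size is given in the text, and the function names are hypothetical.

```python
import math


def ring_radius_mm(d_spacing, wavelength, distance_mm):
    """Radius on a flat, centred detector of the ring at a given d-spacing."""
    theta = math.asin(wavelength / (2.0 * d_spacing))  # Bragg angle
    return distance_mm * math.tan(2.0 * theta)         # scattering angle is 2*theta


def d_spacing_at_radius(radius_mm, wavelength, distance_mm):
    """d-spacing recorded at a given radius on a flat, centred detector."""
    two_theta = math.atan(radius_mm / distance_mm)
    return wavelength / (2.0 * math.sin(two_theta / 2.0))


WAVELENGTH = 1.0          # Angstrom, from Table 1
DISTANCE_MM = 150.0       # crystal-to-detector distance, from the text
HALF_EDGE_MM = 315.0 / 2  # assumed half-width of the detector active area

radius_17 = ring_radius_mm(1.7, WAVELENGTH, DISTANCE_MM)
d_at_edge = d_spacing_at_radius(HALF_EDGE_MM, WAVELENGTH, DISTANCE_MM)

print(f"radius of the 1.7 A ring: {radius_17:.1f} mm")
print(f"d-spacing at the detector edge: {d_at_edge:.2f} A")
```

Under these assumptions the 1.7 Å ring lies well inside the detector edge, so the reported resolution limit would not be set by the detector geometry alone.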


Figure 3 
Typical X-ray diffraction pattern of HCoV-OC43 N-NTD. The arrow shows the 
data at the resolution limit (1.7 Å). 


Figure 4 

Multiple sequence alignment of CoV N-NTDs; T-Coffee (Notredame et al., 2000) 
was used to define conserved residues. The numbering above the sequence is for the 
amino-acid sequence from OC43. Conserved residues are shaded. Fully conserved 
residues are shaded red. Residues that were partially conserved at levels of 75% 
and 50% are shaded orange and yellow, respectively. The amino-acid sequences of 
OC43 (HCoV-OC43; NP_937954), SARS (SARS-CoV; ABI96968), 229E (HCoV- 
229E; AAG48597) and IBV (infectious bronchitis virus; AAB24054) N-NTD were 
obtained from GenBank. 


3. Results and discussion 


The HCoV-OC43 N-NTD crystal chosen for this study was shown to 
diffract X-rays to 1.7 Å resolution (Fig. 3) and belonged to space 
group P6₅, with unit-cell parameters a = 81.57, c = 42.87 Å. The 
Matthews coefficient of 2.06 Å³ Da⁻¹ calculated using MATTHEWS_COEF 
(Collaborative Computational Project, Number 4, 1994; 
Matthews, 1968) suggested there was likely to be one molecule in the 
asymmetric unit, with a solvent content of 40.26%. A homology 
search for the HCoV-OC43 N-NTD structure was performed using 
the BLAST server (Altschul et al., 1997; http://blast.ncbi.nlm.nih.gov/ 
Blast.cgi) when the synchrotron data were collected in December 
2008. The sequence-alignment search indicated that HCoV-OC43 
N-NTD shares 30-40% sequence identity with other N-NTDs of 
coronaviruses (Fig. 4) and the N-terminal domain from SARS 
coronavirus (PDB code 2ofz; Saikatendu et al., 2007) was chosen as 
an initial search model owing to the low E value of 1 x 10~*>. The first 
molecular-replacement trial was performed using the automated 
interface PERON at the Protein Tectonics Platform (PTP), RIKEN 
SPring-8 Center, Japan (Sugahara et al., 2008). The best result was 
obtained using the MOLREP program (Vagin & Teplyakov, 2010). A 
single and unambiguous solution for the rotation and translation 
function was obtained with reflections in the resolution range 30- 
3.0 Å, and yielded a final correlation coefficient of 0.79 and an R 
factor of 0.44. The core of the model consists of a tightly packed 
β-sheet of five antiparallel strands surrounded by large loops. 
Structural refinement of the HCoV-OC43 N-NTD is currently in 
progress. 
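
The Matthews coefficient and solvent content quoted in this section can be reproduced with a few lines of arithmetic. The Python sketch below is not the MATTHEWS_COEF program; it assumes a molecular mass of exactly 20 kDa (the rounded value given in the abstract), six asymmetric units per cell for space group P6₅, and the common approximation that the solvent fraction is 1 - 1.23/V_M. Function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic Angstrom) of a hexagonal cell with gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, asu_per_cell, mass_da, copies=1):
    """V_M in cubic Angstrom per dalton for `copies` molecules per asymmetric unit."""
    return cell_volume / (asu_per_cell * copies * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / v_m


A, C = 81.57, 42.87   # unit-cell parameters (Angstrom) from the text
ASU_PER_CELL = 6      # general-position multiplicity of P6(5)
MASS_DA = 20000.0     # assumed mass: the rounded 20 kDa from the abstract

volume = hexagonal_cell_volume(A, C)
vm_one = matthews_coefficient(volume, ASU_PER_CELL, MASS_DA, copies=1)
vm_two = matthews_coefficient(volume, ASU_PER_CELL, MASS_DA, copies=2)

print(f"cell volume       = {volume:.0f} A^3")
print(f"V_M (one copy)    = {vm_one:.2f} A^3/Da, solvent = {solvent_fraction(vm_one):.1%}")
print(f"V_M (two copies)  = {vm_two:.2f} A^3/Da, solvent = {solvent_fraction(vm_two):.1%}")
```

With one molecule per asymmetric unit the sketch gives a V_M that rounds to 2.06 Å³ Da⁻¹ and a solvent content of about 40%, close to the reported values; the small difference from 40.26% presumably reflects the exact mass used by the authors. Two molecules would give a negative solvent fraction, which is physically impossible and supports the single-molecule assignment.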